
Data-Driven Forecasting of Residential Water, Heating and Electricity Consumption
The present study applies machine learning algorithms to forecast the short-term energy consumption of 41 apartments in a multi-residential building equipped with smart meters. Predictions are made every week, one week ahead at hourly granularity, for heating, electricity, HVAC, and cold and hot water, with optional calendar and weather covariates. In future research, this data-driven virtual model of consumption is intended to be integrated into a hybrid digital twin of the building, alongside the physics-based model developed by Cenaero. The first step in this study was to address the difficulties of a real-world dataset: the saw-tooth behavior in consumption signals caused by the unit pulse step of the smart meters, and missing data. The saw-tooth behavior was handled by linear interpolation, whereas simple but effective median consumption profiles were selected to impute missing values. The second step was to account for strong disturbances caused by changes in residents, using a sliding-window linear regression strategy to detect these changes and split the training dataset accordingly. As pointed out in the review, highly complex methods are not necessarily the best forecasters. A benchmark of forecasters of increasing complexity was therefore established, ranging from naïve, statistical, and simple machine learning forecasters (moving average, exponential smoothing, Prophet, linear regressor, and gradient boosting) to deep learning forecasters such as TiDE. Hyperparameter optimization was performed with the Optuna framework, covering covariate options as well as weight decay/L2 regularization and several learning rate scheduler strategies. This demonstrated the value of using external temperature to improve heating prediction, and of using regularization and scheduler techniques to counter unstable training caused by the high stochasticity of residential data.
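The two preprocessing steps described above can be illustrated with a minimal pandas sketch. It is not the authors' implementation: the function names `smooth_sawtooth` and `impute_with_median_profile` are hypothetical, the meter is assumed to report an hourly cumulative index that moves in whole pulse steps, and grouping the median profile by weekday and hour is an assumption, since the abstract does not say how the profiles were built.

```python
import numpy as np
import pandas as pd


def smooth_sawtooth(cumulative: pd.Series) -> pd.Series:
    """Spread unit meter pulses evenly over the hours between two pulses.

    Assumption: `cumulative` is an hourly meter index that only moves in
    whole pulse steps, so its hourly difference looks like 0, 0, 1, 0, 0, 1.
    """
    # Keep only the readings where the index changed (a pulse arrived).
    changed = cumulative.diff().fillna(1) != 0
    anchors = cumulative.where(changed)
    # Interpolate linearly between pulse arrivals, then go back to hourly use.
    return anchors.interpolate(method="linear").diff()


def impute_with_median_profile(consumption: pd.Series) -> pd.Series:
    """Fill gaps with the median consumption of the same weekday and hour.

    Assumption: `consumption` has an hourly DatetimeIndex.
    """
    keys = [
        consumption.index.dayofweek.to_numpy(),
        consumption.index.hour.to_numpy(),
    ]
    # transform() returns one median per row, aligned with the original index.
    profile = consumption.groupby(keys).transform("median")
    return consumption.fillna(profile)
```

For a cumulative index of `[0, 0, 0, 1, 1, 1, 2]`, the first function replaces the raw hourly differences `0, 0, 1, 0, 0, 1` with a constant one third of a pulse per hour. The second function only needs a few weeks of history for every (weekday, hour) slot to have a median.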
Except for HVAC forecasting, for which a naïve moving average algorithm yielded the best results, the Ridge linear regressor and TiDE emerged as the most promising methods, adapting quickly to changes in residents in the test dataset after one week. Current work also focuses on medium-term forecasts and on enriching the benchmark with other forecasting methods. Further perspectives include probabilistic forecasting, clustering of consumption profiles, and coupling with the physics-based model.
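To make the two simplest ends of the benchmark concrete, the sketch below sets a naïve weekly moving average next to a ridge regressor that maps one week of hourly values to the next. It is an illustration under assumptions, not the study's code: the abstract does not name the library, the lag window, or the regularization strength, so the closed-form NumPy solution, the 168-hour input window, `alpha`, and all function names are hypothetical choices.

```python
import numpy as np

WEEK = 168  # hours in one week


def weekly_moving_average(history: np.ndarray, n_weeks: int = 4) -> np.ndarray:
    """Naive forecast: mean of the last `n_weeks` weeks, hour by hour."""
    recent = history[-n_weeks * WEEK:].reshape(n_weeks, WEEK)
    return recent.mean(axis=0)


def fit_ridge(history: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Closed-form ridge weights mapping one week to the following week."""
    n = len(history) // WEEK
    weeks = history[-n * WEEK:].reshape(n, WEEK)
    X, Y = weeks[:-1], weeks[1:]  # each week predicts the next one
    X = np.hstack([X, np.ones((len(X), 1))])  # bias column
    penalty = alpha * np.eye(X.shape[1])
    penalty[-1, -1] = 0.0  # the intercept is not regularized
    # Solve (X^T X + alpha * I) W = X^T Y for all 168 outputs at once.
    return np.linalg.solve(X.T @ X + penalty, X.T @ Y)


def forecast_ridge(weights: np.ndarray, last_week: np.ndarray) -> np.ndarray:
    """Predict the next 168 hours from the most recent 168 hours."""
    return np.append(last_week, 1.0) @ weights
```

A call such as `forecast_ridge(fit_ridge(history), history[-WEEK:])` returns a one-week-ahead hourly forecast that can be scored against `weekly_moving_average(history)` on the same test week. Covariates such as external temperature would enter as extra columns of `X`.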